🔄 SIMD Programming
Specific
AVX512, Vector Instructions, Loop Unrolling, Auto-vectorization
APL Performance
⚡ SIMD
aplwiki.com · 3d · Hacker News
Geekbench investigates up to 30% jump with Intel's iBOT: performance gain attributed to newly-vectorized instructions
🧮 Compute Optimization
tomshardware.com · 2d
KTransformers Adds AVX2 MoE Support For Viable Performance On CPUs Without AMX/AVX-512
⚡ Hardware Acceleration
phoronix.com · 15h
philtomson/llama.cpp: LLM inference in C/C++ (fork of PrismML fork that enables CPU (incl AVX2 and AVX512) and ROCm for AMD GPUs
🦙 Ollama
github.com · 4h · r/LocalLLaMA
Metal Quantized Attention: pulling M5 Max ahead with Int8 matrix multiplication
⚡ Hardware Acceleration
releases.drawthings.ai · 1d · Hacker News
Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling
⚡ Vectorized Execution
arxiv.org · 21h
MXFP8 GEMM: Up to 99% of cuBLAS Performance Using CUDA and PTX
⚡ Glommio
danielvegamyhre.github.io · 5d · Hacker News
Automated Multiphysics For Successful 3D-IC Design
🔬 Chip Fabrication
semiengineering.com · 18h
Supercharging Redpanda Streaming with profile-guided optimization
⚡ PGO
redpanda.com · 1d
Intel Delivers Open, Scalable AI Performance in MLPerf Inference v6.0
🏗️ LLM Infrastructure
newsroom.intel.com · 1d
facebookincubator/dispenso: The project provides high-performance concurrency, enabling highly parallel computation.
⚡ Glommio
github.com · 21h · Hacker News
Why I’m Building a Database Engine in C#
🏹 Apache Arrow
nockawa.github.io · 6d · Hacker News
Analyzing Geekbench 6 under Intel's BOT
⚙️ Mechanical Sympathy
geekbench.com · 2d · Hacker News, r/hardware
JetStream 3: A modern benchmark for high-performance, compute-intensive Web applications
⚡ Systems Performance
blog.chromium.org · 2d · Hacker News, Blogger
Speculative Decoding: Performance or Illusion?
📊 Model Serving Economics
specdecode-bench.github.io · 6d · Hacker News
MaaS Updates
🏗️ LLM Infrastructure
hpc-ai.com · 2d · Hacker News
In-depth: Google TurboQuant cuts LLM memory 6x, resets AI inference cost curve
🧠 LLM Inference
digitimes.com · 6d
Accelerate Token Production in AI Factories Using Unified Services and Real-Time AI
🖥 GPUs
developer.nvidia.com · 1d
Boost Training Goodput: How Continuous Checkpointing Optimizes Reliability in Orbax and MaxText
💾 Database Checkpointing
developers.googleblog.com · 2d
Looking into a Pixel by Nonlinear Unmixing -- A Generative Approach
🧩 MoE
arxiv.org · 21h